@jiqing-feng
Contributor

@jiqing-feng jiqing-feng commented Jan 9, 2026

Fix XPU 4-bit kernel for odd-sized weights and resolve related test failures.
The original XPU 4-bit kernel did not handle odd weight shapes correctly. This PR fixes the logic and resolves failures in tests such as test_4bit_quant_large.
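The kernel code itself is not shown in this conversation. As a rough illustration of why odd sizes need care, here is a minimal plain-Python sketch (not the XPU kernel) of 4-bit packing. It assumes two codes per byte with the even-indexed code in the high nibble; the function names `pack_4bit` and `unpack_4bit` are hypothetical. With an odd number of codes, the last byte holds only one valid nibble, so the byte count must round up and the unpack step must stop at the true element count.

```python
def pack_4bit(values):
    """Pack 4-bit codes (0..15) into bytes, two codes per byte.

    Assumption for this sketch: even-indexed codes go in the high nibble.
    """
    n = len(values)
    # ceil(n / 2): an odd tail still needs its own (half-filled) byte.
    packed = bytearray((n + 1) // 2)
    for i, v in enumerate(values):
        if not 0 <= v <= 15:
            raise ValueError("4-bit codes must be in 0..15")
        if i % 2 == 0:
            packed[i // 2] |= v << 4  # even index: high nibble
        else:
            packed[i // 2] |= v       # odd index: low nibble
    return bytes(packed)


def unpack_4bit(packed, n):
    """Recover exactly n codes; the padding nibble of an odd tail is ignored."""
    out = []
    for i in range(n):
        byte = packed[i // 2]
        out.append(byte >> 4 if i % 2 == 0 else byte & 0x0F)
    return out


codes = [1, 2, 3, 4, 5]          # odd element count
blob = pack_4bit(codes)          # 3 bytes, last low nibble is padding
restored = unpack_4bit(blob, len(codes))
```

A kernel that sizes its loop as `n // 2` bytes, or that always writes two outputs per byte, would drop or overrun the final element in this case.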

Results of pytest -k "xpu" -ra ./:

2495 passed, 1335 skipped, 3998 deselected, 24 xfailed, 46 warnings

@jiqing-feng
Contributor Author

Hi @matthewdouglas. Would you please review this PR? Thanks!

Signed-off-by: jiqing-feng <[email protected]>
@matthewdouglas matthewdouglas added this to the v0.49.2 milestone Jan 12, 2026
@matthewdouglas
Member

Hi,

Because this moves to int64 for indexing, I assume direct uses like this one in Unsloth would need to be updated?

https://github.com/unslothai/unsloth/blob/ec1757c1a02175851146ff5f6ab2a26c8c863fc8/unsloth/kernels/utils.py#L438-L452

@jiqing-feng jiqing-feng marked this pull request as draft January 13, 2026 04:59
@jiqing-feng jiqing-feng marked this pull request as ready for review January 13, 2026 05:22
@jiqing-feng
Contributor Author

Hi @matthewdouglas. I have updated the kernels so that they cast the dtype inside the kernel, with no API change. Please review the new changes. Thanks!
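For readers following the int64 discussion, the sketch below is one way to picture the trade-off. It is plain Python, not the actual kernel code, and the names `offset_int32` and `offset_int64` are hypothetical. It assumes an element offset computed as `block_id * block_size`: in 32-bit arithmetic the product wraps once it exceeds 2**31 - 1, whereas widening inside the function keeps the caller-facing arguments unchanged.

```python
import ctypes


def offset_int32(block_id, block_size):
    """Offset as a 32-bit kernel would compute it; wraps past 2**31 - 1."""
    # ctypes integer types do no overflow checking, so this truncates
    # the product to 32 bits the same way a C int32 multiply would.
    return ctypes.c_int32(block_id * block_size).value


def offset_int64(block_id, block_size):
    """Same arguments, widened to 64 bits before the multiply."""
    wide_id = ctypes.c_int64(block_id).value
    return wide_id * block_size


# Hypothetical sizes chosen so the product exceeds the int32 range.
narrow = offset_int32(40000, 65536)
wide = offset_int64(40000, 65536)
```

Here `narrow` comes out negative, which would index outside the tensor, while `wide` is the intended offset. Because the widening happens inside the function, callers that pass 32-bit values would not need to change under this assumption.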

2 participants